An Evaluation Framework Based on Gold Standard Models for ..
Authors
Abstract
This paper presents Solon, a weakly supervised evaluation framework for definition question answering (DefQA). It automatically evaluates a set of DefQA systems using existing human definitions as gold standard models. This allows the framework to overcome known limitations of state-of-the-art evaluation methods while requiring less supervision. In addition, Solon adapts its configuration to each specific DefQA task, thus yielding a good evaluation procedure. The results of our experiments show that Solon is able to detect the best systems and to score them accordingly, with state-of-the-art performance.
Similar resources
Contrastive analysis of diagnostic tests evaluation without gold standard: review article
With the advancement of medical sciences, diagnostic tests have been developed to distinguish patients from the healthy population. Therefore, determining and evaluating the accuracy of diagnostic tests is of great importance. The accuracy of a test under evaluation is determined by the degree of agreement between its results and those of the gold standard, and this test accuracy...
A Practical Self-Assessment Framework for Evaluation of Maintenance Management System based on RAMS Model and Maintenance Standards
Over the life cycle of equipment, a set of technical, administrative, and management activities is carried out to keep it in good condition and functioning as expected. This is referred to as a maintenance management system (MMS). Frameworks and models for assessing and enhancing the effectiveness of an MMS fall into two categories: qualitative and quantitative. In this resea...
Portfolio Performance Evaluation in a Modified Mean-Variance-Skewness Framework with Negative Data
The present study attempts to evaluate the performance of portfolios using a mean-variance-skewness model with negative data. A mean-variance non-linear framework and a mean-variance-skewness non-linear framework had been proposed based on Data Envelopment Analysis (DEA), in which the variance of the assets was used as an input to the DEA and expected return and skewness were the outputs. C...
Find the word that does not belong: A Framework for an Intrinsic Evaluation of Word Vector Representations
We present a new framework for an intrinsic evaluation of word vector representations based on the outlier detection task. This task is intended to test the capability of vector space models to create semantic clusters in the space. We carried out a pilot study building a gold standard dataset and the results revealed two important features: human performance on the task is extremely high compa...
Data envelopment analysis in service quality evaluation: an empirical study
Service quality is often conceptualized as the comparison between service expectations and the actual performance perceptions. It enhances customer satisfaction, decreases customer defection, and promotes customer loyalty. Substantial literature has examined the concept of service quality, its dimensions, and measurement methods. We introduce the perceived service quality index (PSQI) as a sing...
Journal title:
Volume, issue:
Pages: -
Publication date: 2006